Overview

Dataset statistics

Number of variables: 14
Number of observations: 15866
Missing cells: 623
Missing cells (%): 0.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 MiB
Average record size in memory: 112.0 B
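
The dataset-level percentages can be re-derived from the counts reported in this document. The sketch below is only an arithmetic check; the variable names are illustrative, and the per-column missing counts are taken from the variable sections further down.

```python
# Figures copied from the dataset statistics and the per-variable summaries.
n_variables = 14
n_observations = 15866
missing_cells = 623

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells  # displayed rounded to one decimal

# Missing counts per column; all other columns report 0 missing values.
missing_by_column = {"beer_abv": 621, "brewery_name": 1, "review_profilename": 1}

# Total size implied by the average record size, converted to MiB.
approx_total_mib = 112.0 * n_observations / 2**20
```

The three per-column counts add up to the reported 623 missing cells, and 623 out of 222124 cells rounds to the 0.3% shown above.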

Variable types

NUM: 10
CAT: 4

Warnings

brewery_name has a high cardinality: 1753 distinct values
review_profilename has a high cardinality: 5309 distinct values
beer_style has a high cardinality: 103 distinct values
beer_name has a high cardinality: 6526 distinct values
beer_abv has 621 (3.9%) missing values
Unnamed: 0 has unique values

Reproduction

Analysis started: 2020-11-04 01:58:29.517251
Analysis finished: 2020-11-04 01:58:48.489668
Duration: 18.97 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
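
A report of this kind could be regenerated roughly as follows. This is a minimal sketch assuming the pandas-profiling 2.x interface (`ProfileReport` and its `to_file` method); the function name `build_report`, the CSV path, and the output file name are hypothetical, since the document does not state how the data was loaded.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so this sketch can be loaded without the dependencies installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Without index_col, a header-less first CSV column is named "Unnamed: 0".
    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Pandas Profiling Report")
    profile.to_file(html_path)
    return html_path
```

The `Unnamed: 0` variable in this report is what pandas produces for a first CSV column without a header, so passing `index_col=0` to `read_csv` would likely have kept it out of the profile.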

Variables

Unnamed: 0
Real number (ℝ≥0)

UNIQUE

Distinct: 15866
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 792166.7627
Minimum: 12
Maximum: 1586529
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 12
5-th percentile: 77447.25
Q1: 399948
Median: 791572.5
Q3: 1193125.5
95-th percentile: 1506580.5
Maximum: 1586529
Range: 1586517
Interquartile range (IQR): 793177.5

Descriptive statistics

Standard deviation: 458183.8024
Coefficient of variation (CV): 0.5783931161
Kurtosis: -1.203072527
Mean: 792166.7627
Median Absolute Deviation (MAD): 396346.5
Skewness: -0.001417095701
Sum: 1.256851786e+10
Variance: 2.099323967e+11
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
299006 | 1 | < 0.1%
1540447 | 1 | < 0.1%
1383126 | 1 | < 0.1%
1520339 | 1 | < 0.1%
926417 | 1 | < 0.1%
283343 | 1 | < 0.1%
1131213 | 1 | < 0.1%
1434313 | 1 | < 0.1%
748232 | 1 | < 0.1%
1481414 | 1 | < 0.1%
Other values (15856) | 15856 | 99.9%

Minimum 5 values

Value | Count | Frequency (%)
12 | 1 | < 0.1%
119 | 1 | < 0.1%
224 | 1 | < 0.1%
304 | 1 | < 0.1%
343 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1586529 | 1 | < 0.1%
1586364 | 1 | < 0.1%
1586352 | 1 | < 0.1%
1586343 | 1 | < 0.1%
1586248 | 1 | < 0.1%
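
Several of the statistics reported for Unnamed: 0 are simple functions of the others. The following sketch re-derives them from the printed values as a consistency check; it uses only figures from this section, and the variable names are illustrative.

```python
import math

# Values copied from the Unnamed: 0 summary above.
n_observations = 15866
minimum, maximum = 12, 1586529
q1, q3 = 399948, 1193125.5
mean, std = 792166.7627, 458183.8024

value_range = maximum - minimum   # Range: 1586517
iqr = q3 - q1                     # Interquartile range: 793177.5
cv = std / mean                   # Coefficient of variation: std relative to the mean
variance = std ** 2               # Variance is the squared standard deviation
total = mean * n_observations     # Sum is the mean times the number of observations
```

The results agree with the reported Range, IQR, CV, Variance and Sum up to the rounding of the printed inputs.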
 

brewery_id
Real number (ℝ≥0)

Distinct: 1769
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3183.496596
Minimum: 1
Maximum: 27800
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 30
Q1: 142
Median: 423
Q3: 2391
95-th percentile: 16866
Maximum: 27800
Range: 27799
Interquartile range (IQR): 2249

Descriptive statistics

Standard deviation: 5689.24471
Coefficient of variation (CV): 1.787105636
Kurtosis: 3.356527267
Mean: 3183.496596
Median Absolute Deviation (MAD): 360
Skewness: 2.076065472
Sum: 50509357
Variance: 32367505.37
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
35 | 390 | 2.5%
10099 | 316 | 2.0%
147 | 316 | 2.0%
140 | 278 | 1.8%
132 | 231 | 1.5%
287 | 231 | 1.5%
1199 | 207 | 1.3%
220 | 190 | 1.2%
345 | 190 | 1.2%
29 | 166 | 1.0%
Other values (1759) | 13351 | 84.1%

Minimum 5 values

Value | Count | Frequency (%)
1 | 10 | 0.1%
3 | 47 | 0.3%
4 | 79 | 0.5%
5 | 14 | 0.1%
6 | 2 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
27800 | 1 | < 0.1%
27681 | 1 | < 0.1%
27087 | 1 | < 0.1%
27039 | 8 | 0.1%
26990 | 1 | < 0.1%
 

brewery_name
Categorical

HIGH CARDINALITY

Distinct: 1753
Distinct (%): 11.0%
Missing: 1
Missing (%): < 0.1%
Memory size: 124.0 KiB

Boston Beer Company (Samuel Adams): 390
Stone Brewing Co.: 316
Dogfish Head Brewery: 316
Sierra Nevada Brewing Co.: 278
Rogue Ales: 231
Other values (1748): 14334

Common values

Value | Count | Frequency (%)
Boston Beer Company (Samuel Adams) | 390 | 2.5%
Stone Brewing Co. | 316 | 2.0%
Dogfish Head Brewery | 316 | 2.0%
Sierra Nevada Brewing Co. | 278 | 1.8%
Rogue Ales | 231 | 1.5%
Bell's Brewery, Inc. | 231 | 1.5%
Founders Brewing Company | 207 | 1.3%
Lagunitas Brewing Company | 190 | 1.2%
Victory Brewing Company | 190 | 1.2%
Avery Brewing Company | 166 | 1.0%
Other values (1743) | 13350 | 84.1%

[Figure: frequencies of value counts]

Unique

Unique: 705
Unique (%): 4.4%

[Figure: histogram of lengths of the category]

Length

Max length: 66
Median length: 23
Mean length: 23.69128955
Min length: 3
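
The report lists both a "Distinct" and a "Unique" count for categorical variables. The sketch below illustrates one reading of the two terms, assuming that "Distinct" counts the different non-missing values and "Unique" counts the values that occur exactly once; that interpretation, the helper name, and the toy data are assumptions rather than something stated in this document.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing entries (None)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                              # how many different values appear
    unique = sum(1 for c in counts.values() if c == 1)  # values that appear exactly once
    return distinct, unique

# Hypothetical toy column: one brewery repeated, two seen once, one missing entry.
sample = ["Rogue Ales", "Stone Brewing Co.", "Rogue Ales", "Avery Brewing Company", None]
distinct, unique = distinct_and_unique(sample)
```

Under this reading, brewery_name has 1753 different breweries, of which 705 appear in only a single review.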

review_time
Real number (ℝ≥0)

Distinct: 15865
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1225493757
Minimum: 894931201
Maximum: 1326251972
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 894931201
5-th percentile: 1073805417
Q1: 1174793682
Median: 1241334608
Q3: 1289233383
95-th percentile: 1318747448
Maximum: 1326251972
Range: 431320771
Interquartile range (IQR): 114439701.2

Descriptive statistics

Standard deviation: 76132666.97
Coefficient of variation (CV): 0.06212407576
Kurtosis: -0.2562526226
Mean: 1225493757
Median Absolute Deviation (MAD): 53092013
Skewness: -0.7620441542
Sum: 1.944368394e+13
Variance: 5.79618298e+15
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1290897310 | 2 | < 0.1%
1177038841 | 1 | < 0.1%
1217819467 | 1 | < 0.1%
1303083850 | 1 | < 0.1%
1319068489 | 1 | < 0.1%
1137384262 | 1 | < 0.1%
1232747330 | 1 | < 0.1%
1300144961 | 1 | < 0.1%
1258826560 | 1 | < 0.1%
1228915517 | 1 | < 0.1%
Other values (15855) | 15855 | 99.9%

Minimum 5 values

Value | Count | Frequency (%)
894931201 | 1 | < 0.1%
908236801 | 1 | < 0.1%
917308801 | 1 | < 0.1%
984787201 | 1 | < 0.1%
993912246 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
1326251972 | 1 | < 0.1%
1326232870 | 1 | < 0.1%
1326227237 | 1 | < 0.1%
1326216953 | 1 | < 0.1%
1326185385 | 1 | < 0.1%
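
The magnitude of the review_time values suggests Unix timestamps (seconds since 1970-01-01 UTC), although the report itself treats the column as a plain number. Under that assumption, a minimal sketch for turning the reported minimum and maximum into readable dates looks like this; the helper name `to_utc` is illustrative.

```python
from datetime import datetime, timezone

def to_utc(epoch_seconds):
    """Interpret a number as Unix epoch seconds and return a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

# Minimum and maximum copied from the review_time summary above.
earliest = to_utc(894931201)
latest = to_utc(1326251972)
```

If the assumption holds, the reviews in this sample span from 12 May 1998 to 11 January 2012 (UTC).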
 

review_overall
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.823490483
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4.5
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7112147028
Coefficient of variation (CV): 0.186011893
Kurtosis: 1.627512023
Mean: 3.823490483
Median Absolute Deviation (MAD): 0.5
Skewness: -1.011864773
Sum: 60663.5
Variance: 0.5058263534
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
4 | 5910 | 37.2%
4.5 | 3231 | 20.4%
3.5 | 3023 | 19.1%
3 | 1612 | 10.2%
5 | 918 | 5.8%
2.5 | 581 | 3.7%
2 | 373 | 2.4%
1.5 | 122 | 0.8%
1 | 96 | 0.6%

Minimum 5 values

Value | Count | Frequency (%)
1 | 96 | 0.6%
1.5 | 122 | 0.8%
2 | 373 | 2.4%
2.5 | 581 | 3.7%
3 | 1612 | 10.2%

Maximum 5 values

Value | Count | Frequency (%)
5 | 918 | 5.8%
4.5 | 3231 | 20.4%
4 | 5910 | 37.2%
3.5 | 3023 | 19.1%
3 | 1612 | 10.2%
3161210.2%
 

review_aroma
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.74454809
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4
95-th percentile: 4.5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.6929546586
Coefficient of variation (CV): 0.1850569526
Kurtosis: 1.257369698
Mean: 3.74454809
Median Absolute Deviation (MAD): 0.5
Skewness: -0.8668971118
Sum: 59411
Variance: 0.4801861588
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
4 | 5651 | 35.6%
3.5 | 3671 | 23.1%
4.5 | 2737 | 17.3%
3 | 1922 | 12.1%
2.5 | 646 | 4.1%
5 | 639 | 4.0%
2 | 400 | 2.5%
1.5 | 132 | 0.8%
1 | 68 | 0.4%

Minimum 5 values

Value | Count | Frequency (%)
1 | 68 | 0.4%
1.5 | 132 | 0.8%
2 | 400 | 2.5%
2.5 | 646 | 4.1%
3 | 1922 | 12.1%

Maximum 5 values

Value | Count | Frequency (%)
5 | 639 | 4.0%
4.5 | 2737 | 17.3%
4 | 5651 | 35.6%
3.5 | 3671 | 23.1%
3 | 1922 | 12.1%
3192212.1%
 

review_appearance
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.845833859
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 3.5
Median: 4
Q3: 4
95-th percentile: 4.5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.6081135223
Coefficient of variation (CV): 0.1581226711
Kurtosis: 1.808656181
Mean: 3.845833859
Median Absolute Deviation (MAD): 0.5
Skewness: -0.9120150828
Sum: 61018
Variance: 0.369802056
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
4 | 6827 | 43.0%
3.5 | 3172 | 20.0%
4.5 | 2890 | 18.2%
3 | 1658 | 10.5%
5 | 625 | 3.9%
2.5 | 350 | 2.2%
2 | 259 | 1.6%
1.5 | 52 | 0.3%
1 | 33 | 0.2%

Minimum 5 values

Value | Count | Frequency (%)
1 | 33 | 0.2%
1.5 | 52 | 0.3%
2 | 259 | 1.6%
2.5 | 350 | 2.2%
3 | 1658 | 10.5%

Maximum 5 values

Value | Count | Frequency (%)
5 | 625 | 3.9%
4.5 | 2890 | 18.2%
4 | 6827 | 43.0%
3.5 | 3172 | 20.0%
3 | 1658 | 10.5%
3165810.5%
 

review_profilename
Categorical

HIGH CARDINALITY

Distinct: 5309
Distinct (%): 33.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 124.0 KiB

northyorksammy: 61
Thorpe429: 52
BuckeyeNation: 52
mikesgroove: 45
ChainGangGuy: 43
Other values (5304): 15612

Common values

Value | Count | Frequency (%)
northyorksammy | 61 | 0.4%
Thorpe429 | 52 | 0.3%
BuckeyeNation | 52 | 0.3%
mikesgroove | 45 | 0.3%
ChainGangGuy | 43 | 0.3%
Billolick | 38 | 0.2%
TheManiacalOne | 38 | 0.2%
akorsak | 37 | 0.2%
drabmuh | 34 | 0.2%
Mora2000 | 34 | 0.2%
Other values (5299) | 15431 | 97.3%

[Figure: frequencies of value counts]

Unique

Unique: 2755
Unique (%): 17.4%

[Figure: histogram of lengths of the category]

Length

Max length: 16
Median length: 9
Mean length: 8.975797302
Min length: 3

beer_style
Categorical

HIGH CARDINALITY

Distinct: 103
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 124.0 KiB

American IPA: 1159
American Double / Imperial IPA: 883
American Pale Ale (APA): 619
American Porter: 550
American Double / Imperial Stout: 524
Other values (98): 12131

Common values

Value | Count | Frequency (%)
American IPA | 1159 | 7.3%
American Double / Imperial IPA | 883 | 5.6%
American Pale Ale (APA) | 619 | 3.9%
American Porter | 550 | 3.5%
American Double / Imperial Stout | 524 | 3.3%
Russian Imperial Stout | 520 | 3.3%
American Amber / Red Ale | 446 | 2.8%
Belgian Strong Dark Ale | 356 | 2.2%
Fruit / Vegetable Beer | 350 | 2.2%
Saison / Farmhouse Ale | 335 | 2.1%
Other values (93) | 10124 | 63.8%

[Figure: frequencies of value counts]

Unique

Unique: 0
Unique (%): 0.0%

[Figure: histogram of lengths of the category]

Length

Max length: 35
Median length: 18
Mean length: 17.81488718
Min length: 4

review_palate
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.746785579
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4
95-th percentile: 4.5
Maximum: 5
Range: 4
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.6732922656
Coefficient of variation (CV): 0.179698638
Kurtosis: 1.311415167
Mean: 3.746785579
Median Absolute Deviation (MAD): 0.5
Skewness: -0.8658062548
Sum: 59446.5
Variance: 0.4533224749
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
4 | 6155 | 38.8%
3.5 | 3412 | 21.5%
4.5 | 2502 | 15.8%
3 | 2033 | 12.8%
2.5 | 631 | 4.0%
5 | 599 | 3.8%
2 | 366 | 2.3%
1.5 | 108 | 0.7%
1 | 60 | 0.4%

Minimum 5 values

Value | Count | Frequency (%)
1 | 60 | 0.4%
1.5 | 108 | 0.7%
2 | 366 | 2.3%
2.5 | 631 | 4.0%
3 | 2033 | 12.8%

Maximum 5 values

Value | Count | Frequency (%)
5 | 599 | 3.8%
4.5 | 2502 | 15.8%
4 | 6155 | 38.8%
3.5 | 3412 | 21.5%
3 | 2033 | 12.8%
3203312.8%
 

review_taste
Real number (ℝ≥0)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.802880373
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2.5
Q1: 3.5
Median: 4
Q3: 4.5
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7247920843
Coefficient of variation (CV): 0.1905902929
Kurtosis: 1.358259339
Mean: 3.802880373
Median Absolute Deviation (MAD): 0.5
Skewness: -0.9601350005
Sum: 60336.5
Variance: 0.5253235655
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value | Count | Frequency (%)
4 | 5368 | 33.8%
4.5 | 3413 | 21.5%
3.5 | 3292 | 20.7%
3 | 1657 | 10.4%
5 | 861 | 5.4%
2.5 | 638 | 4.0%
2 | 412 | 2.6%
1.5 | 128 | 0.8%
1 | 97 | 0.6%

Minimum 5 values

Value | Count | Frequency (%)
1 | 97 | 0.6%
1.5 | 128 | 0.8%
2 | 412 | 2.6%
2.5 | 638 | 4.0%
3 | 1657 | 10.4%

Maximum 5 values

Value | Count | Frequency (%)
5 | 861 | 5.4%
4.5 | 3413 | 21.5%
4 | 5368 | 33.8%
3.5 | 3292 | 20.7%
3 | 1657 | 10.4%
3165710.4%
 

beer_name
Categorical

HIGH CARDINALITY

Distinct: 6526
Distinct (%): 41.1%
Missing: 0
Missing (%): 0.0%
Memory size: 124.0 KiB

Old Rasputin Russian Imperial Stout: 42
90 Minute IPA: 35
Sierra Nevada Celebration Ale: 34
India Pale Ale: 33
Pale Ale: 32
Other values (6521): 15690

Common values

Value | Count | Frequency (%)
Old Rasputin Russian Imperial Stout | 42 | 0.3%
90 Minute IPA | 35 | 0.2%
Sierra Nevada Celebration Ale | 34 | 0.2%
India Pale Ale | 33 | 0.2%
Pale Ale | 32 | 0.2%
Stone Ruination IPA | 32 | 0.2%
Sierra Nevada Pale Ale | 31 | 0.2%
Two Hearted Ale | 30 | 0.2%
Ayinger Celebrator Doppelbock | 30 | 0.2%
Founders KBS (Kentucky Breakfast Stout) | 29 | 0.2%
Other values (6516) | 15538 | 97.9%

[Figure: frequencies of value counts]

Unique

Unique: 3951
Unique (%): 24.9%

[Figure: histogram of lengths of the category]

Length

Max length: 74
Median length: 19
Mean length: 20.56271272
Min length: 2

beer_abv
Real number (ℝ≥0)

MISSING

Distinct: 254
Distinct (%): 1.7%
Missing: 621
Missing (%): 3.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.045513283
Minimum: 0.05
Maximum: 41
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 0.05
5-th percentile: 4.5
Q1: 5.2
Median: 6.5
Q3: 8.5
95-th percentile: 11
Maximum: 41
Range: 40.95
Interquartile range (IQR): 3.3

Descriptive statistics

Standard deviation: 2.320866046
Coefficient of variation (CV): 0.3294104989
Kurtosis: 10.5706658
Mean: 7.045513283
Median Absolute Deviation (MAD): 1.5
Skewness: 1.762830752
Sum: 107408.85
Variance: 5.386419202
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
5 | 1078 | 6.8%
8 | 705 | 4.4%
6 | 642 | 4.0%
7 | 616 | 3.9%
9 | 601 | 3.8%
5.5 | 561 | 3.5%
10 | 532 | 3.4%
6.5 | 480 | 3.0%
5.2 | 434 | 2.7%
7.5 | 433 | 2.7%
Other values (244) | 9163 | 57.8%
(Missing) | 621 | 3.9%

Minimum 5 values

Value | Count | Frequency (%)
0.05 | 1 | < 0.1%
0.45 | 1 | < 0.1%
0.5 | 4 | < 0.1%
1.2 | 1 | < 0.1%
1.5 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
41 | 2 | < 0.1%
32 | 2 | < 0.1%
27 | 2 | < 0.1%
26 | 2 | < 0.1%
19.5 | 1 | < 0.1%
 

beer_beerid
Real number (ℝ≥0)

Distinct: 6752
Distinct (%): 42.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22027.5
Minimum: 5
Maximum: 76814
Zeros: 0
Zeros (%): 0.0%
Memory size: 124.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 219
Q1: 1769
Median: 14925.5
Q3: 39621
95-th percentile: 62999
Maximum: 76814
Range: 76809
Interquartile range (IQR): 37852

Descriptive statistics

Standard deviation: 21902.53081
Coefficient of variation (CV): 0.9943266739
Kurtosis: -0.8454913006
Mean: 22027.5
Median Absolute Deviation (MAD): 14149.5
Skewness: 0.6717315153
Sum: 349488315
Variance: 479720855.8
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
412 | 42 | 0.3%
2093 | 35 | 0.2%
1904 | 34 | 0.2%
4083 | 32 | 0.2%
276 | 31 | 0.2%
1093 | 30 | 0.2%
131 | 30 | 0.2%
680 | 29 | 0.2%
19960 | 29 | 0.2%
92 | 27 | 0.2%
Other values (6742) | 15547 | 98.0%

Minimum 5 values

Value | Count | Frequency (%)
5 | 2 | < 0.1%
6 | 2 | < 0.1%
7 | 5 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 5 values

Value | Count | Frequency (%)
76814 | 1 | < 0.1%
76813 | 1 | < 0.1%
76803 | 1 | < 0.1%
76756 | 1 | < 0.1%
76726 | 1 | < 0.1%
 

Interactions

[100 pairwise interaction plots; images not reproduced]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
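
The calculation described above can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not the code the profiling tool runs; the function name `pearson_r` and the toy data are hypothetical, and sample (n - 1) normalisation is assumed, which cancels out in the ratio.

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    # A constant input has zero standard deviation, so r is undefined (division by zero).
    return cov / (std_x * std_y)

# Hypothetical toy data: y is an exact increasing linear function of x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
r_linear = pearson_r(xs, ys)
# Shifting and rescaling one variable leaves r unchanged (location/scale invariance).
r_rescaled = pearson_r(xs, [3 * v + 7 for v in ys])
```

Reversing the order of `ys` would flip the sign of the result, which corresponds to total negative linear correlation.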

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
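
As a sketch of that recipe: rank each variable, then apply the Pearson formula to the ranks. The helper names below are hypothetical, and ties are assumed to receive the average of the ranks they span, which is a common convention rather than something this document specifies.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at sorted position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical toy data: y = x**3 is monotonic in x but not linear.
xs = [1, 2, 3, 4, 5]
ys = [v ** 3 for v in xs]
rho = spearman_rho(xs, ys)
r = pearson_r(xs, ys)
```

On this toy data ρ reaches its maximum because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.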

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
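
The pair-counting definition above can be written out directly. The sketch below implements exactly that formula (often called tau-a), where tied pairs count as neither concordant nor discordant; statistical libraries frequently default to the tie-corrected tau-b variant instead, so results may differ on data with many ties. The function name and toy data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau as (concordant - discordant) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical toy data: one swapped pair out of six possible pairs.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6. The double loop is quadratic in the number of observations, which is fine for illustration but slow for large columns.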

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

[4 missing-value plots; images not reproduced]

Sample

First rows

index | Unnamed: 0 | brewery_id | brewery_name | review_time | review_overall | review_aroma | review_appearance | review_profilename | beer_style | review_palate | review_taste | beer_name | beer_abv | beer_beerid
0 | 1232172 | 10001 | Hockley Valley Brewing Co. | 1268863522 | 3.5 | 3.5 | 3.0 | MattyV | Irish Dry Stout | 2.0 | 3.0 | Hockley Stout | 4.6 | 35859
1 | 521244 | 113 | Samuel Smith Old Brewery (Tadcaster) | 1209435786 | 4.5 | 4.0 | 4.5 | bluegrassbrew | English Porter | 4.0 | 4.0 | Samuel Smith's, The Famous Taddy Porter | 5.0 | 572
2 | 1098847 | 418 | Left Hand Brewing Company | 1309720985 | 3.5 | 3.5 | 3.5 | DrJay | English India Pale Ale (IPA) | 3.5 | 3.5 | 400 Pound Monkey | 6.7 | 44706
3 | 246137 | 607 | High Point Brewing Company | 1063648679 | 4.5 | 4.5 | 4.0 | Dantes | Märzen / Oktoberfest | 4.5 | 4.0 | Ramstein Oktoberfest | 6.0 | 12718
4 | 1260943 | 112 | North Coast Brewing Co. | 1269408994 | 4.0 | 4.0 | 3.5 | nickfl | German Pilsener | 3.5 | 3.5 | Scrimshaw Pilsner | 4.4 | 409
5 | 1519120 | 735 | 21st Amendment Brewery | 1310115300 | 3.5 | 3.5 | 3.5 | rootbeerman | American Double / Imperial IPA | 3.0 | 3.5 | Hop Crisis | 9.7 | 42063
6 | 1350294 | 7944 | Ridgeway Brewing | 1261368179 | 4.0 | 3.5 | 3.5 | Richardberg | Foreign / Export Stout | 4.0 | 4.0 | Lump Of Coal | 8.0 | 20905
7 | 218892 | 158 | Great Divide Brewing Company | 1322882947 | 4.0 | 4.0 | 4.0 | SolipsismalCat | Old Ale | 4.0 | 4.0 | Hibernation Ale | 8.7 | 1446
8 | 1057389 | 200 | Mendocino Brewing Company | 1227070530 | 3.5 | 4.0 | 4.0 | PatrickJR | American Barleywine | 3.5 | 3.0 | Talon - True Style Barley Wine Ale | 10.5 | 16439
9 | 1472182 | 1086 | Cleveland ChopHouse And Brewery | 1170090653 | 4.5 | 4.0 | 4.0 | mattcrill | Belgian Strong Pale Ale | 4.0 | 4.0 | Hop Diablo | NaN | 35010

Last rows

index | Unnamed: 0 | brewery_id | brewery_name | review_time | review_overall | review_aroma | review_appearance | review_profilename | beer_style | review_palate | review_taste | beer_name | beer_abv | beer_beerid
15856 | 1013003 | 143 | Spoetzl Brewery | 1235428401 | 3.5 | 3.5 | 3.5 | Vengeance526 | Vienna Lager | 3.5 | 3.5 | Shiner 98 Bavarian Style Amber | NaN | 36866
15857 | 674390 | 14400 | Ninkasi Brewing Company | 1250638883 | 4.0 | 3.0 | 4.0 | morimech | American Pale Ale (APA) | 4.0 | 3.5 | Radiant Summer Ale | 6.0 | 50271
15858 | 1239111 | 494 | Malt Shovel Brewery | 1092280686 | 3.5 | 4.0 | 4.0 | Zorro | English Porter | 3.5 | 3.5 | James Squire Porter | 5.0 | 2078
15859 | 282905 | 35 | Boston Beer Company (Samuel Adams) | 1250169053 | 4.0 | 4.0 | 4.0 | ghebb | Vienna Lager | 4.0 | 4.0 | Samuel Adams Boston Lager | 4.9 | 104
15860 | 263580 | 68 | Flying Dog Brewery | 1284672178 | 3.0 | 3.5 | 4.5 | EricCioe | American Double / Imperial IPA | 4.0 | 4.5 | Double Dog Double Pale Ale | 11.5 | 35754
15861 | 69060 | 466 | South African Breweries plc | 1309552188 | 4.0 | 3.5 | 4.5 | katan | Milk / Sweet Stout | 3.5 | 3.5 | Castle Milk Stout | 6.0 | 7371
15862 | 618382 | 10097 | Harpoon Brewery | 1309836484 | 2.0 | 2.0 | 4.0 | SWMeyer4141 | American Pale Wheat Ale | 1.5 | 2.0 | UFO Hefeweizen | 5.1 | 318
15863 | 904258 | 142 | Spaten-Franziskaner-Bräu | 1297177461 | 4.5 | 4.0 | 4.0 | CuriousMonk | Hefeweizen | 4.0 | 4.5 | Franziskaner Hefe-Weisse | 5.0 | 1946
15864 | 1035686 | 39 | Privatbrauerei Franz Inselkammer KG / Brauerei Aying | 1286336151 | 4.0 | 3.0 | 3.5 | TheQuietMan22 | Dortmunder / Export Lager | 3.5 | 4.0 | Ayinger Jahrhundert Bier | 5.5 | 133
15865 | 1135428 | 863 | Russian River Brewing Company | 1297447950 | 5.0 | 4.5 | 4.5 | SupaCelt | American Double / Imperial IPA | 4.5 | 4.5 | Pliny The Elder | 8.0 | 7971